OpenDLib: A Digital Library Service System
Authors: Leonardo Candela, Donatella Castelli, Pasquale Pagano (ISTI-CNR, Italy)
Abstract
This chapter introduces OpenDLib, a Digital Library Service System developed at ISTI-CNR to ease the creation and management of Digital Libraries. It discusses the motivations underlying the development of such a system and describes it by presenting (i) the characteristics of the wide variety of content it is capable of managing, (ii) the set of functions it natively provides to its Digital Libraries, (iii) the power and flexibility of its component-oriented architectural paradigm as a key feature for addressing different application scenarios, and (iv) the technologies on which the system's development relies. The authors hope that understanding the foundational principles of OpenDLib will not only inform stakeholders and decision makers of the features implemented by this existing system, but also help researchers and application developers understand the issues, and their possible solutions, that arise when building Digital Library systems aiming to serve such a broad class of application scenarios.
INTRODUCTION
Digital Libraries (DLs) are complex systems whose history spans approximately the last fifteen years. Over this period, many narrowly focused systems have been developed from scratch to serve the needs of specific communities. These systems are tightly bound to the characteristics of the single scenario they were conceived for and have proved hard to maintain over time. Alongside them, a few general-purpose Digital Library Systems have been conceived, i.e., software systems designed to be easily used for building Digital Libraries suitable for certain application contexts. This second class of systems systematizes the techniques and the software needed to implement Digital Libraries with certain characteristics, resulting in a product that communities can use to build their own Digital Libraries.
OpenDLib is an example of this second class of Digital Library System, with an architecture explicitly designed to support plug-and-play expansions. Its design and development began in 2000 in response to a pressing demand for software that could enable different user communities to create their own DLs. We decided to design general-purpose software that could be customized to meet the needs of different application scenarios. At that time we called this software a Digital Library Service System, to stress that it is a system that manages digital library services and makes them publicly available (Castelli & Pagano, 2003). The role of OpenDLib is analogous to that of a database management system for a database, i.e., it supports the creation and maintenance of distributed DLs. A DL can be created by instantiating OpenDLib appropriately and then either loading or harvesting the content to be managed. Our initial aim was to design a software tool that could provide a number of core DL functions on general content and that could be easily expanded. A DL is a very expensive resource; it must therefore be maintained over time, even as new technologies that enable new functions are developed or new kinds of usage are proposed. To cope with this dynamic scenario, the DL must grow over time along several dimensions, e.g., services, supported metadata formats, hosting servers, and user communities. OpenDLib was designed to support this notion of evolution.
BACKGROUND AND REQUIREMENTS FOR A DIGITAL LIBRARY SYSTEM
The systems developed in the digital library area so far are mainly dedicated to supporting digital repositories that satisfy the needs of single institutions.
Among such systems, usually called “digital repository systems”, Fedora (Lagoze, Payette, Shin & Wilper, 2005), DSpace (Tansley, Bass and Smith, 2003), and EPrints (Millington and Nixon, 2007) are the most popular and widely adopted. Fedora is a repository system specifically designed for storing and managing complex objects. It is implemented as a set of web services that provide full programmatic management of digital objects as well as search and access to multiple representations of such objects (Payette and Thornton, 2002). Fedora is particularly well suited to working in a broad web service framework and acting as the foundation layer for a variety of multi-tiered systems, service-oriented architectures, and end-user applications. DSpace is an open source digital repository software system for research institutions (Tansley, Bass, Stuve, Branschofsky, Chudnov, McClellan & Smith, 2003). It enables organizations to capture and describe digital material using a submission workflow module or a variety of programmatic ingest options; to distribute an organization’s digital assets over the web through a search and retrieval system; and to preserve digital assets over the long term. EPrints, now in version 3, was among the first free repository software packages for building OAI-compliant repositories and is probably one of the most widespread (as of December 2007, 240 known archives were running this software). This new version is a major leap forward in functionality, aiming to simplify the tasks of the various players (depositors, researchers, managers) involved in running and maintaining a repository. Digital repository systems, however, provide only core functionality, such as search, retrieval, and access to information objects, and this is not enough to meet the application-specific requirements that digital libraries are usually asked to satisfy.
For example, during its lifetime a DL may have to support additional functionality to satisfy the needs of new organizations that join the DL, offering their specific information content and their computers to host the system; or it may have to manage policies that regulate access to content and services. OpenDLib (OpenDLib; Castelli & Pagano, 2002; Castelli & Pagano, 2003) was conceived to meet these specific DL requirements.
OPENDLIB FEATURES
OpenDLib was created as a software toolkit for setting up digital libraries capable of satisfying the multifaceted needs expressed by heterogeneous user communities. To achieve this goal, it must support a powerful and flexible content model, a set of functions for dealing appropriately with such potentially rich content, and an architectural paradigm that allows it to be used in application scenarios with different deployment requirements. As regards content, OpenDLib can handle a wide variety of information object types. This capability stems from its use of a powerful and flexible document model named DoMDL (Candela, Castelli, Pagano & Simi, 2005). DoMDL was designed to represent structured, multilingual, and multimedia objects in a way that can be customized according to the content to be handled. For example, it can be used to describe a lecture as the composition of the teacher's presentation together with the slides and the summary of the talk transcript. Moreover, the object may be disseminated as the MPEG3 format of the video or as the SMIL object synchronizing its parts. In order to represent objects with completely different structures, DoMDL distinguishes four main aspects of document modelling and, using terms and definitions very similar to those coined in the IFLA FRBR model (IFLA, 1998), represents these aspects through the following entities: Document, Edition, View, and Manifestation.
The Document entity, which represents the object as a distinct intellectual creation, captures its most general aspect. The Edition models an instance of the Document entity along the time dimension. The preliminary version of a paper and the version published in the proceedings are examples of successive editions of the same object. The View is the way in which an edition is perceived. A view excludes physical aspects that are not related to how a document is to be perceived. For example, the original edition of the proceedings of a workshop might be disseminated under two different views: (a) a “structured textual view” containing a “Preface” written by the conference chairs and the list of thematic sessions containing the accepted papers, and (b) a “presentation view” containing the list of the slides used in the presentations. The Manifestation models the physical formats in which an object is disseminated, e.g., the MPEG file containing the video recording of a lecture given at a certain summer school, or the AVI file of the same video. These entities are semantically connected by a set of relationships that may be multiple, i.e., there may be multiple editions of the same document, multiple views of the same edition, and multiple manifestations of the same view. Each entity of the model has a set of attributes that specify the rights on the modelled document aspects. This makes it possible, for example, to model different rights on different editions, different access policies on different views or on different parts of the same view, and so on.
The information objects of an OpenDLib digital library are organised into a set of virtual collections, each characterised by its own access policy. Authorised people can define new collections dynamically, i.e., by specifying membership criteria.
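The four DoMDL entities and the notion of a virtual collection defined by membership criteria can be illustrated with a minimal Python sketch. This is not the OpenDLib implementation: every class, attribute, and file name below is hypothetical and chosen only to mirror the description above, with rights modelled as simple key-value attributes and membership criteria modelled as a predicate over documents.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Manifestation:
    """Physical format in which a view is disseminated (e.g. an MPEG or AVI file)."""
    media_type: str
    location: str  # hypothetical file name or URL


@dataclass
class View:
    """A way of perceiving an edition, independent of its physical format."""
    name: str
    manifestations: List[Manifestation] = field(default_factory=list)
    rights: Dict[str, str] = field(default_factory=dict)


@dataclass
class Edition:
    """An instance of a document along the time dimension."""
    label: str
    views: List[View] = field(default_factory=list)
    rights: Dict[str, str] = field(default_factory=dict)


@dataclass
class Document:
    """The object as a distinct intellectual creation."""
    title: str
    metadata: Dict[str, str] = field(default_factory=dict)
    editions: List[Edition] = field(default_factory=list)
    rights: Dict[str, str] = field(default_factory=dict)


@dataclass
class VirtualCollection:
    """A collection defined by membership criteria rather than by a fixed list."""
    name: str
    access_policy: str
    is_member: Callable[[Document], bool]

    def members(self, library: List[Document]) -> List[Document]:
        # Membership is evaluated against the current library content,
        # so newly published matching objects appear automatically.
        return [doc for doc in library if self.is_member(doc)]


# One document, one edition, one view, two manifestations of that view.
lecture = Document(
    title="Summer school lecture",
    metadata={"type": "lecture"},
    editions=[Edition(
        label="original",
        views=[View(
            name="video",
            manifestations=[
                Manifestation("video/mpeg", "lecture.mpg"),
                Manifestation("video/x-msvideo", "lecture.avi"),
            ],
        )],
    )],
)
report = Document(title="Technical report", metadata={"type": "grey literature"})

library = [lecture, report]
grey = VirtualCollection(
    name="Grey literature",
    access_policy="public",
    is_member=lambda doc: doc.metadata.get("type") == "grey literature",
)
grey_titles = [doc.title for doc in grey.members(library)]

# Publishing a new matching object changes the collection without redefining it.
library.append(Document(title="Working paper", metadata={"type": "grey literature"}))
grey_titles_after = [doc.title for doc in grey.members(library)]
```

Expressing the collection as a predicate, instead of a stored list of members, is one simple way to obtain the automatic update behaviour; a real system would more plausibly evaluate the criteria as a query against an index.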
In the same digital library, for example, it is possible to maintain a collection of grey literature accessible to all users and a collection of historical images accessible only to a specific group of researchers. Each collection is automatically updated whenever a new object matching the membership criteria is published in the digital library.
From the functional point of view, the basic release of OpenDLib provides services that support the submission, description, indexing, search, browsing, retrieval, access, preservation, and visualization of information objects. These objects can be submitted as files in a chosen format or as URLs to objects stored elsewhere. They can be described using one or more metadata formats. The OpenDLib search service offers different search options: free text or fielded (where fields can be selected from a variety of known metadata formats); single or cross-language; with or without relevance feedback. Retrieved information objects can be navigated across all their editions, versions, structures, metadata, and formats. All the above services can be customised along several dimensions, for example metadata formats, controlled vocabularies, and browsable fields. OpenDLib is also equipped with other digital-library-specific services, such as those that enforce access policies on information objects and those that manage “user shelves” able to hold versions of information objects, result sets, session results, and other information. In addition, a number of administration functions support the preservation of objects, the object review process, and the handling of user and user group profiles.
From the architectural point of view, OpenDLib adopts a component-based approach.
This application development approach is the most suitable for achieving the OpenDLib goal because: (i) the system is assembled from discrete executable components that are developed and deployed somewhat independently of one another, potentially by different players; (ii) the system can be upgraded in smaller increments, i.e., by upgrading only some of its constituent components. This aspect in particular is key to achieving interoperability, since upgrading the appropriate constituents of a system enables it to interact with other systems; (iii) components may be shared between systems, which creates opportunities for reuse that contribute heavily to lowering development and maintenance costs and the time to market; and (iv) although this is not strictly a consequence of being component-based, component-based systems tend to be distributed. OpenDLib implements this approach by realising a service-oriented architecture: it consists of an open federation of services that can be distributed and replicated on different hosting servers. By exploiting this feature, OpenDLib supports three kinds of dynamic expansion: (i) new classes of services can easily be added to the federation; (ii) new instances of a replicated or distributed service can be deployed on either an existing or a new hosting server; (iii) the configurations of the services can be modified so that they can handle new object types and new metadata formats and support new usages. Thus the architectural configuration chosen when the digital library is set up can be changed later to satisfy newly emerging needs. For instance, a replica of an Index instance can be created to reduce the workload when the number of search requests exceeds an established threshold, whereas an Index instance able to serve queries in a language not previously supported can be added to satisfy the needs of a new community of users.
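The three kinds of dynamic expansion can be pictured with a small in-memory registry. The sketch below is purely illustrative and does not reflect how OpenDLib stores or manages its federation: the `Federation` and `ServiceInstance` classes, the host names, and the `pending_requests` load measure are all assumptions introduced for this example.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ServiceInstance:
    service_class: str  # e.g. "Index" or "Repository"
    host: str           # hypothetical hosting server name
    config: Dict[str, object] = field(default_factory=dict)
    pending_requests: int = 0  # assumed load measure


class Federation:
    """Toy registry of service instances, grouped by service class."""

    def __init__(self) -> None:
        self._instances: Dict[str, List[ServiceInstance]] = {}

    def deploy(self, service_class: str, host: str, **config: object) -> ServiceInstance:
        # Covers expansions (i) and (ii): an unseen class name adds a new
        # class of service; a known one adds another instance on some host.
        instance = ServiceInstance(service_class, host, dict(config))
        self._instances.setdefault(service_class, []).append(instance)
        return instance

    def instances(self, service_class: str) -> List[ServiceInstance]:
        return list(self._instances.get(service_class, []))

    def reconfigure(self, instance: ServiceInstance, **changes: object) -> None:
        # Covers expansion (iii): change what a running instance handles.
        instance.config.update(changes)

    def replicate_if_overloaded(
        self, service_class: str, threshold: int, new_host: str
    ) -> Optional[ServiceInstance]:
        # Create a replica of the busiest instance once its load exceeds
        # the threshold; otherwise leave the federation unchanged.
        current = self.instances(service_class)
        if not current:
            return None
        busiest = max(current, key=lambda inst: inst.pending_requests)
        if busiest.pending_requests <= threshold:
            return None
        return self.deploy(service_class, new_host, **copy.deepcopy(busiest.config))


federation = Federation()
index_en = federation.deploy("Index", "host-a.example.org", languages=["en"])

# The English index becomes overloaded, so a replica is deployed on a new host.
index_en.pending_requests = 120
replica = federation.replicate_if_overloaded(
    "Index", threshold=100, new_host="host-b.example.org"
)

# A new user community needs Italian queries: add a differently configured Index.
index_it = federation.deploy("Index", "host-c.example.org", languages=["it"])

# An existing instance is reconfigured to handle a new metadata format.
federation.reconfigure(index_en, metadata_formats=["dc"])

index_hosts = [inst.host for inst in federation.instances("Index")]
```

Every change here is an update to the registry and nothing else is restarted, which is the property the text describes as modifying the federation without switching off the digital library.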
All the above dynamic updates to the configuration of the federation can be made on the fly, i.e., without switching off the digital library. This architectural design provides great flexibility in the management of the digital library, which can be managed and hosted either by a single organisation or by many. For example, an organisation can decide to maintain its own instance of the repository service, in order to keep local control over its documents, while sharing all the other services with other institutions. The OpenDLib architecture has been designed to be highly interoperable with other digital libraries. In particular, an OpenDLib library can act both as an OAI-PMH data provider and as an OAI-PMH service provider (Lagoze and Van de Sompel, 2001). This means that the metadata maintained by an OpenDLib digital library can be opened to other digital libraries and, vice versa, that the OpenDLib services can access the metadata published by any other OAI-PMH compliant digital library.
From an implementation point of view, OpenDLib is an HTTP (web) server-based software system that requires only open source technologies and runs on several major operating systems, such as Solaris, Linux, SCO, Digital Unix, Unix V, and AIX. OpenDLib is distributed as an Apache Server Module that contains all the modules belonging to the OpenDLib kernel. At activation time, it loads the set of self-installing modules representing DL Web Services in accordance with the configuration instructions. DL service modules are either function libraries or object libraries used to define a class and its methods. The system adopts XML as the standard for internal (service-to-service) and external message exchange. An HTTP-based communication protocol, named OpenDLib Protocol (OLP), has been designed to regulate communication among the services.
OLP messages can be sent as REST or SOAP requests, and a set of APIs has been defined to facilitate the use of the protocol in other existing systems.
CONCLUSIONS
OpenDLib is now an operational system that has been used to build a number of DLs. Each of these DLs has its own specific distributed architectural configuration that reflects the needs of the application scenario in which it operates. Thanks to its expandability, OpenDLib has proved able to support the creation of very different DLs by dynamically and progressively aggregating resources provided by many dispersed organizations. Recently OpenDLib has been extended. Its new version, OpenDLibG (Candela, Castelli, Pagano & Simi, 2006), is able to exploit the storage and processing capabilities offered by a Grid infrastructure. Thanks to this extension, an OpenDLib digital library can now handle a much wider class of information objects, namely objects requiring huge storage capacity, such as raw data and particular types of images, videos, and 3D objects. Further, it can also process such data on demand to generate new information as the result of computationally intensive processing that OpenDLib automatically distributes to exploit the available computational resources. This extension has confirmed the soundness of the system, its openness, and the power and flexibility of DoMDL.
Publication date: 2002